22 research outputs found

    Multi-party Poisoning through Generalized pp-Tampering

    Get PDF
    In a poisoning attack against a learning algorithm, an adversary tampers with a fraction of the training data $T$ with the goal of increasing the classification error of the constructed hypothesis/model over the final test distribution. In the distributed setting, $T$ might be gathered gradually from $m$ data providers $P_1,\dots,P_m$ who generate and submit their shares of $T$ in an online way. In this work, we initiate a formal study of $(k,p)$-poisoning attacks in which an adversary controls $k\in[m]$ of the parties, and even for each corrupted party $P_i$, the adversary submits some poisoned data $T'_i$ on behalf of $P_i$ that is still "$(1-p)$-close" to the correct data $T_i$ (e.g., a $(1-p)$ fraction of $T'_i$ is still honestly generated). For $k=m$, this model becomes the traditional notion of poisoning, and for $p=1$ it coincides with the standard notion of corruption in multi-party computation. We prove that if there is an initial constant error for the generated hypothesis $h$, there is always a $(k,p)$-poisoning attacker who can decrease the confidence of $h$ (to have a small error), or alternatively increase the error of $h$, by $\Omega(p \cdot k/m)$. Our attacks can be implemented in polynomial time given samples from the correct data, and they use no wrong labels if the original distributions are not noisy. At a technical level, we prove a general lemma about biasing bounded functions $f(x_1,\dots,x_n)\in[0,1]$ through an attack model in which each block $x_i$ might be controlled by an adversary with marginal probability $p$ in an online way. When the probabilities are independent, this coincides with the model of $p$-tampering attacks, thus we call our model generalized $p$-tampering. We prove the power of such attacks by incorporating ideas from the context of coin-flipping attacks into the $p$-tampering model and generalize the results in both of these areas.
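    The abstract's core technical object, biasing a bounded function when each block may be tampered with marginal probability $p$, can be illustrated with a small simulation. The sketch below is a minimal Monte-Carlo illustration, not the paper's attack: the adversary tampers with each block with probability $p$ and greedily substitutes whichever candidate raises the estimated value of $f$; the function names, the greedy rule, and all parameters are assumptions made for this example.

```python
import random

def p_tamper_bias(f, sample_block, n, p, trials=8, reps=8, runs=500, seed=0):
    """Monte-Carlo sketch of a p-tampering-style online attack on a bounded
    function f(x_1, ..., x_n) in [0, 1].

    Each block is tampered with probability p; when tampering, the adversary
    greedily picks, out of `trials` candidate blocks, the one that maximizes
    the estimated expectation of f given the prefix chosen so far.
    """
    rng = random.Random(seed)

    def estimate(prefix):
        # Estimate E[f | prefix] by completing the input with honest samples.
        total = 0.0
        for _ in range(reps):
            suffix = [sample_block(rng) for _ in range(n - len(prefix))]
            total += f(prefix + suffix)
        return total / reps

    honest_avg, tampered_avg = 0.0, 0.0
    for _ in range(runs):
        honest_avg += f([sample_block(rng) for _ in range(n)])

        y = []
        for _ in range(n):
            if rng.random() < p:   # this block is under adversarial control
                candidates = [sample_block(rng) for _ in range(trials)]
                y.append(max(candidates, key=lambda c: estimate(y + [c])))
            else:                  # this block stays honestly generated
                y.append(sample_block(rng))
        tampered_avg += f(y)
    return honest_avg / runs, tampered_avg / runs

if __name__ == "__main__":
    # Toy target: f is the fraction of 1s among n fair coin flips, so the
    # honest mean is ~0.5 and any increase is the bias gained by tampering.
    n = 10
    fraction_of_ones = lambda x: sum(x) / len(x)
    print(p_tamper_bias(fraction_of_ones, lambda r: r.randint(0, 1), n, p=0.3))
```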

    Bounding Training Data Reconstruction in DP-SGD

    Full text link
    Differentially private training offers protection that is usually interpreted as a guarantee against membership inference attacks. By proxy, this guarantee extends to other threats like reconstruction attacks attempting to extract complete training examples. Recent works provide evidence that if one does not need to protect against membership attacks but instead only wants to protect against training data reconstruction, then the utility of private models can be improved because less noise is required to protect against these more ambitious attacks. We investigate this further in the context of DP-SGD, a standard algorithm for private deep learning, and provide an upper bound on the success of any reconstruction attack against DP-SGD together with an attack that empirically matches the predictions of our bound. Together, these two results open the door to fine-grained investigations on how to set the privacy parameters of DP-SGD in practice to protect against reconstruction attacks. Finally, we use our methods to demonstrate that different settings of the DP-SGD parameters leading to the same DP guarantees can result in significantly different success rates for reconstruction, indicating that the DP guarantee alone might not be a good proxy for controlling the protection against reconstruction attacks.
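    For context on the knobs the abstract refers to, the sketch below shows a single DP-SGD update with per-example gradient clipping and Gaussian noise, the two parameters that determine the DP guarantee; the logistic-regression model, clipping norm, and noise multiplier here are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def dp_sgd_step(w, X_batch, y_batch, lr=0.1, clip_norm=1.0, noise_mult=1.1, rng=None):
    """One DP-SGD step for logistic regression (illustrative sketch).

    Per-example gradients are clipped to L2 norm `clip_norm`, summed, and
    perturbed with Gaussian noise of std `noise_mult * clip_norm` before
    averaging; different (clip_norm, noise_mult, batch size) combinations can
    yield the same (epsilon, delta) guarantee, which is the regime the
    abstract's final result examines.
    """
    rng = rng or np.random.default_rng(0)
    grads = []
    for x, y in zip(X_batch, y_batch):
        pred = 1.0 / (1.0 + np.exp(-x @ w))
        g = (pred - y) * x                      # per-example gradient
        norm = np.linalg.norm(g)
        g = g / max(1.0, norm / clip_norm)      # clip to L2 norm <= clip_norm
        grads.append(g)
    noise = rng.normal(0.0, noise_mult * clip_norm, size=w.shape)
    g_priv = (np.sum(grads, axis=0) + noise) / len(X_batch)
    return w - lr * g_priv

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(32, 5))
    y = (X[:, 0] > 0).astype(float)
    w = np.zeros(5)
    for _ in range(100):
        w = dp_sgd_step(w, X, y, rng=rng)
    print(w)
```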

    Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation

    Full text link
    Recent works have demonstrated that deep learning models are vulnerable to backdoor poisoning attacks that instill spurious correlations with external trigger patterns or objects (e.g., stickers, sunglasses). We find that such external trigger signals are unnecessary, as highly effective backdoors can be easily inserted using rotation-based image transformation. Our method constructs the poisoned dataset by rotating a limited number of objects and labeling them incorrectly; once trained on it, the victim's model will make undesirable predictions at inference time. Comprehensive empirical studies on image classification and object detection tasks show that the attack achieves a high success rate while maintaining clean performance. Furthermore, we evaluate standard data augmentation techniques and four different backdoor defenses against our attack and find that none of them can serve as a consistent mitigation approach. Our attack can be easily deployed in the real world since it only requires rotating the object, as we show in both image classification and object detection applications. Overall, our work highlights a new, simple, physically realizable, and highly effective vector for backdoor attacks. Our video demo is available at https://youtu.be/6JIF8wnX34M.
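    As a minimal sketch of the poisoning step described above, and not the authors' released code, the snippet below rotates a small random fraction of training images and relabels them with an attacker-chosen target class; the rotation angle, poison rate, and target label are assumptions made for this example.

```python
import numpy as np
from PIL import Image

def poison_with_rotation(images, labels, target_label, angle=45.0,
                         poison_rate=0.05, seed=0):
    """Build a rotation-backdoored training set (illustrative sketch).

    A random `poison_rate` fraction of images is rotated by `angle` degrees
    and relabeled as `target_label`; at test time, the attacker triggers the
    backdoor simply by presenting a rotated object.
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        rotated = Image.fromarray(images[i]).rotate(angle)  # rotate in place of an external trigger
        images[i] = np.asarray(rotated)
        labels[i] = target_label                             # attacker-chosen label
    return images, labels

if __name__ == "__main__":
    # Toy 8x8 grayscale "dataset" just to exercise the function.
    imgs = np.random.default_rng(2).integers(0, 255, size=(100, 8, 8), dtype=np.uint8)
    lbls = np.zeros(100, dtype=int)
    poisoned_imgs, poisoned_lbls = poison_with_rotation(imgs, lbls, target_label=1)
    print(int((poisoned_lbls == 1).sum()), "poisoned samples")
```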